Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information
Abstract
The paper describes the application of k-Means, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated, hand-constructed semantic verb classes. A series of post-hoc cluster analyses explored the influence of specific frames and frame groups on the coherence of the verb classes, and supported the tight connection between the syntactic behaviour of the verbs and their lexical meaning components.
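To make the setup in the abstract concrete, here is a minimal sketch of k-means clustering over verb frame distributions. It is not the authors' implementation: the verbs, frame labels, and counts are invented for illustration, and both the squared Euclidean distance and the farthest-first initialisation are assumptions of this sketch rather than choices taken from the paper.

```python
# Toy k-means over subcategorisation frame distributions.
# All data below is hypothetical and only illustrates the data shape.

FRAMES = ["n", "na", "nad", "ns"]  # hypothetical frame labels

# Hypothetical frame counts per verb (one count per entry in FRAMES).
COUNTS = {
    "laufen":   [80, 15, 0, 5],
    "gehen":    [85, 10, 0, 5],
    "geben":    [5, 30, 60, 5],
    "schenken": [5, 25, 65, 5],
    "sagen":    [5, 25, 10, 60],
    "glauben":  [10, 20, 5, 65],
}


def normalise(counts):
    """Turn raw frame counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance (an assumption of this sketch)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(vectors, k):
    """Deterministic initialisation: start from the first vector, then
    repeatedly add the vector farthest from the centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(vectors, k, max_iterations=100):
    """Alternate assignment and centroid update until assignments are stable."""
    centroids = farthest_first(vectors, k)
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


verbs = list(COUNTS)
vectors = [normalise(COUNTS[v]) for v in verbs]
labels = kmeans(vectors, k=3)

# Group verbs by cluster label for inspection.
clusters = {}
for verb, label in zip(verbs, labels):
    clusters.setdefault(label, []).append(verb)
print(clusters)
```

On this toy data, verbs with similar frame preferences end up in the same cluster. The real task involves many more frames and verbs, and the resulting clusters would be compared against a hand-constructed classification as described above.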
Similar resources
Inferring and evaluating semantic classes of verbs signaling modality
We infer semantic classes of verbs signaling modality from a purely syntactic classification of 637 German verbs by applying findings from linguistics about correspondences between verb meaning and syntax. Our extensive evaluation of the semantic classification is based on a linking to three other lexical resources at the word sense level: to the German wordnet GermaNet and to the English resou...
Spectral Clustering for German Verbs
We describe and evaluate the application of a spectral clustering technique (Ng et al., 2002) to the unsupervised clustering of German verbs. Our previous work has shown that standard clustering techniques succeed in inducing Levin-style semantic classes from verb subcategorisation information. But clustering in the very high dimensional spaces that we use is fraught with technical and conceptua...
Distributional models of verb meaning: syntactic versus lexical contexts
Over the last decade or so, distributional methods have become the mainstay of semantic modelling in Computational Linguistics. As such, they have also been applied to the automatic modelling of verb meaning. However, more than with other lexical categories, the research into verb semantics has taken its inspiration from the idea that a verb's meaning is strongly linked to its syntactic behaviour ...
A Subcategorisation Lexicon for German Verbs induced from a Lexicalised PCFG
The paper presents a large-scale computational subcategorisation lexicon for several thousand German verbs. The lexical entries were obtained by unsupervised learning in a statistical grammar framework: a German context-free grammar containing frame-predicting grammar rules and information about lexical heads was trained on 18.7 million words of a large German newspaper corpus. We developed a s...
A Large Coverage Verb Taxonomy for Arabic
In this article I present a lexicon for Arabic verbs which exploits Levin's verb classes (Levin, 1993) and the basic development procedure used by Schuler (2005). The verb lexicon in its current state has 173 classes which contain 4392 verbs and 498 frames providing information about verb root, the deverbal form of the verb, the participle, thematic roles, subcategorisation frames and syntacti...